{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "toc": true
   },
   "source": [
    "<h1>Table of Contents<span class=\"tocSkip\"></span></h1>\n",
    "<div class=\"toc\"><ul class=\"toc-item\"><li><span><a href=\"#Set-up-directories\" data-toc-modified-id=\"Set-up-directories-1\"><span class=\"toc-item-num\">1&nbsp;&nbsp;</span>Set up directories</a></span></li></ul></div>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This notebook identifies and sums up all periodicals/journals classified under the Library of Congress subject headlines “social sciences” or “statistics” in circulation in any year during the nineteenth century using previously downloaded bibliographic records (RIS files) from OLCL WorldCat.\n",
    "\n",
    "The query at WorldCat returned a total of 8,928 catalogue records. I downloaded each in collections of 100 (i.e. 89 RIS files) and then parse here the files to create a data frame containing relevant identifying information (e.g., journal title, language, years in print, publication place). After removing duplicates and catalogue records with unknown start or end years. (e.g., “1840–1900s” or “1800s–1907), the final universe of nineteenth-century social science and statistics periodicals/journals in circulation at any time between 1800–1914 was 5,196."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Import standard libraries"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-09-17T13:56:24.070095Z",
     "start_time": "2021-09-17T13:56:23.845425Z"
    }
   },
   "outputs": [],
   "source": [
    "import glob \n",
    "import csv\n",
    "import pandas as pd\n",
    "import os"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "pd.set_option('mode.chained_assignment', None)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Set up directories"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Ensure CWD is the scripts folder of the rep directory"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-09-17T13:56:25.876947Z",
     "start_time": "2021-09-17T13:56:25.869001Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'/Volumes/GoogleDrive-107745440581041782819/My Drive/00_Researching/16_SocialScientization/-04_EJS/00_replication/01_scripts'"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "os.getcwd()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-09-17T13:56:28.053985Z",
     "start_time": "2021-09-17T13:56:28.051361Z"
    }
   },
   "outputs": [],
   "source": [
    "directory = os.path.dirname(os.getcwd()) + \"/\"\n",
    "data = directory + \"00_data/\"\n",
    "ivs = data + \"01_ivs/\"\n",
    "journals_folder = ivs + \"worldcat_ris_files/\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Parse each bibliographic data file \n",
    "Loop thru each RIS file from worldcat, which contains about 100 records corresponding to in circulation periodicals. Extract relevant information."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-09-17T13:56:51.681280Z",
     "start_time": "2021-09-17T13:56:51.361318Z"
    }
   },
   "outputs": [],
   "source": [
    "record_split = \"-\"*200\n",
    "files = glob.glob(journals_folder + \"DirectExport*\")\n",
    "with open(ivs + \"19c_journals.csv\", 'w') as csvf: \n",
    "    writer = csv.writer(csvf)\n",
    "    writer.writerow(['title', 'auth', 'lang', 'year', 'start', 'end', 'soc', 'stats'])\n",
    "    for file in files:\n",
    "        with open(file, 'r') as f:\n",
    "            records = f.read()\n",
    "            records = records.split(record_split)\n",
    "            records.pop()\n",
    "            record_n = -1\n",
    "            for record in records: \n",
    "                record_n += 1\n",
    "                line_n = -1\n",
    "                lines = record.split('\\n')\n",
    "                title = \"\"\n",
    "                year = \"\"\n",
    "                auth = \"\"\n",
    "                lang = \"\"\n",
    "                for line in lines:\n",
    "                    line_n += 1\n",
    "                    if \"Title:\" in line:\n",
    "                        title = line.split(\"Title:\")[-1]\n",
    "                        title = title.strip()\n",
    "                        title = title.replace(\"         \", \" \")\n",
    "                        if \":\" in title: \n",
    "                            title += lines[line_n + 1]\n",
    "                    if \"Year:\" in line: \n",
    "                        year = line.split(\":\")[-1].strip()\n",
    "                        if (\"s\" or \"?\") in year: \n",
    "                            start = -99\n",
    "                            end = - 99\n",
    "                        elif \"-\" in year: \n",
    "                            start = year.split(\"-\")[0]\n",
    "                            if len(year.split(\"-\")[1]) == 4:\n",
    "                                end = year.split(\"-\")[1]\n",
    "                            else: \n",
    "                                end = 1914\n",
    "                        else: \n",
    "                            start = year\n",
    "                            end = year\n",
    "                    if \"Corp Author(s):\" in line:\n",
    "                        auth = line.split(\":\")[-1].strip()\n",
    "                    if \"Language:\" in line: \n",
    "                        lang = line.split(\":\")[-1].strip()\n",
    "                    if \"Descriptor:\" in line: \n",
    "                        if \"Social science\" in line:\n",
    "                            soc_sci = 1\n",
    "                        elif \"Statistics\" in line:\n",
    "                            stats = 1\n",
    "                        else: \n",
    "                            soc_sci = 0\n",
    "                            stats = 0\n",
    "                        if soc_sci == 0 and stats == 0:\n",
    "                            if \"Social sciences\" in lines[line_n + 1]:\n",
    "                                soc_sci = 1\n",
    "                            elif \"Statistics\" in lines[line_n + 1]:\n",
    "                                stats = 1\n",
    "                writer.writerow([title, auth, lang, year, start, end, soc_sci, stats])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Filter out publications with unidentified years and with duplicate entries.\n",
    "Records are incomplete and ambigious. Keep only those that have definite start and end dates."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-09-17T13:56:45.138745Z",
     "start_time": "2021-09-17T13:56:45.106325Z"
    }
   },
   "outputs": [],
   "source": [
    "df = pd.read_csv(ivs+\"19c_journals.csv\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-09-17T13:56:46.002620Z",
     "start_time": "2021-09-17T13:56:45.990483Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>title</th>\n",
       "      <th>auth</th>\n",
       "      <th>lang</th>\n",
       "      <th>year</th>\n",
       "      <th>start</th>\n",
       "      <th>end</th>\n",
       "      <th>soc</th>\n",
       "      <th>stats</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>ProQuest statistical abstract of the United St...</td>\n",
       "      <td>United States.; Department of the Treasury.; B...</td>\n",
       "      <td>English</td>\n",
       "      <td>1878-2012</td>\n",
       "      <td>1878</td>\n",
       "      <td>2012</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>The World almanac and book of facts.</td>\n",
       "      <td>NaN</td>\n",
       "      <td>English</td>\n",
       "      <td>1868-</td>\n",
       "      <td>1868</td>\n",
       "      <td>1914</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>New Englander and Yale review; (DLC) 200721923...</td>\n",
       "      <td>Yale University. ; Rouben Mamoulian Collection...</td>\n",
       "      <td>English</td>\n",
       "      <td>1892-</td>\n",
       "      <td>1892</td>\n",
       "      <td>1914</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>The American journal of sociology.</td>\n",
       "      <td>University of Chicago.</td>\n",
       "      <td>English</td>\n",
       "      <td>1895-</td>\n",
       "      <td>1895</td>\n",
       "      <td>1914</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Annals of the American Academy of Political an...</td>\n",
       "      <td>American Academy of Political and Social Scien...</td>\n",
       "      <td>English</td>\n",
       "      <td>1890-</td>\n",
       "      <td>1890</td>\n",
       "      <td>1914</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                                               title  \\\n",
       "0  ProQuest statistical abstract of the United St...   \n",
       "1               The World almanac and book of facts.   \n",
       "2  New Englander and Yale review; (DLC) 200721923...   \n",
       "3                 The American journal of sociology.   \n",
       "4  Annals of the American Academy of Political an...   \n",
       "\n",
       "                                                auth     lang       year  \\\n",
       "0  United States.; Department of the Treasury.; B...  English  1878-2012   \n",
       "1                                                NaN  English      1868-   \n",
       "2  Yale University. ; Rouben Mamoulian Collection...  English      1892-   \n",
       "3                             University of Chicago.  English      1895-   \n",
       "4  American Academy of Political and Social Scien...  English      1890-   \n",
       "\n",
       "   start   end  soc  stats  \n",
       "0   1878  2012    0      0  \n",
       "1   1868  1914    0      1  \n",
       "2   1892  1914    1      1  \n",
       "3   1895  1914    1      1  \n",
       "4   1890  1914    1      1  "
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-07-30T15:27:32.393627Z",
     "start_time": "2021-07-30T15:27:32.390384Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(8928, 8)"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-07-30T15:27:32.829129Z",
     "start_time": "2021-07-30T15:27:32.818795Z"
    }
   },
   "outputs": [],
   "source": [
    "df_filtered = df[df['start'] != -99]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-07-30T15:27:34.063352Z",
     "start_time": "2021-07-30T15:27:34.054833Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>title</th>\n",
       "      <th>auth</th>\n",
       "      <th>lang</th>\n",
       "      <th>year</th>\n",
       "      <th>start</th>\n",
       "      <th>end</th>\n",
       "      <th>soc</th>\n",
       "      <th>stats</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>ProQuest statistical abstract of the United St...</td>\n",
       "      <td>United States.; Department of the Treasury.; B...</td>\n",
       "      <td>English</td>\n",
       "      <td>1878-2012</td>\n",
       "      <td>1878</td>\n",
       "      <td>2012</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>The World almanac and book of facts.</td>\n",
       "      <td>NaN</td>\n",
       "      <td>English</td>\n",
       "      <td>1868-</td>\n",
       "      <td>1868</td>\n",
       "      <td>1914</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>New Englander and Yale review; (DLC) 200721923...</td>\n",
       "      <td>Yale University. ; Rouben Mamoulian Collection...</td>\n",
       "      <td>English</td>\n",
       "      <td>1892-</td>\n",
       "      <td>1892</td>\n",
       "      <td>1914</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>The American journal of sociology.</td>\n",
       "      <td>University of Chicago.</td>\n",
       "      <td>English</td>\n",
       "      <td>1895-</td>\n",
       "      <td>1895</td>\n",
       "      <td>1914</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Annals of the American Academy of Political an...</td>\n",
       "      <td>American Academy of Political and Social Scien...</td>\n",
       "      <td>English</td>\n",
       "      <td>1890-</td>\n",
       "      <td>1890</td>\n",
       "      <td>1914</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                                               title  \\\n",
       "0  ProQuest statistical abstract of the United St...   \n",
       "1               The World almanac and book of facts.   \n",
       "2  New Englander and Yale review; (DLC) 200721923...   \n",
       "3                 The American journal of sociology.   \n",
       "4  Annals of the American Academy of Political an...   \n",
       "\n",
       "                                                auth     lang       year  \\\n",
       "0  United States.; Department of the Treasury.; B...  English  1878-2012   \n",
       "1                                                NaN  English      1868-   \n",
       "2  Yale University. ; Rouben Mamoulian Collection...  English      1892-   \n",
       "3                             University of Chicago.  English      1895-   \n",
       "4  American Academy of Political and Social Scien...  English      1890-   \n",
       "\n",
       "   start   end  soc  stats  \n",
       "0   1878  2012    0      0  \n",
       "1   1868  1914    0      1  \n",
       "2   1892  1914    1      1  \n",
       "3   1895  1914    1      1  \n",
       "4   1890  1914    1      1  "
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_filtered.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-07-30T15:27:39.851860Z",
     "start_time": "2021-07-30T15:27:39.848933Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(6170, 8)"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_filtered.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-07-30T15:27:41.497851Z",
     "start_time": "2021-07-30T15:27:41.491323Z"
    }
   },
   "outputs": [],
   "source": [
    "df_filtered.drop_duplicates(subset=['title', 'auth', 'start'], \n",
    "                            keep='first', \n",
    "                            inplace=True, \n",
    "                            ignore_index=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-07-30T15:27:43.136994Z",
     "start_time": "2021-07-30T15:27:43.133802Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(5196, 8)"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_filtered.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-07-30T15:43:49.491777Z",
     "start_time": "2021-07-30T15:43:49.486359Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "1912-?       41\n",
       "1878-        38\n",
       "1913-        35\n",
       "1905-        32\n",
       "1914-1948    30\n",
       "             ..\n",
       "1901-1921     1\n",
       "1858-1876     1\n",
       "1897-1986     1\n",
       "1910-1991     1\n",
       "1827-1960     1\n",
       "Name: year, Length: 1940, dtype: int64"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_filtered['year'].value_counts()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-07-30T15:27:46.876228Z",
     "start_time": "2021-07-30T15:27:46.835866Z"
    }
   },
   "outputs": [],
   "source": [
    "df_filtered.to_csv(ivs+\"19c_journals.csv\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Create a data frame with count dicts"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Count how many stats and soc pubs per year"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-06-24T18:23:20.771429Z",
     "start_time": "2021-06-24T18:22:38.668204Z"
    }
   },
   "outputs": [],
   "source": [
    "stats_journals = {}\n",
    "soc_journals = {}\n",
    "for year in range(1803, 1915):\n",
    "    if year not in (stats_journals or soc_journals): \n",
    "        stats_journals[year] = 0\n",
    "        soc_journals[year] = 0\n",
    "    for index, journal in df_filtered.iterrows():\n",
    "        if journal['soc'] == 1: \n",
    "            if year in range(journal['start'], journal['end']):\n",
    "                soc_journals[year] += 1\n",
    "        if journal['stats'] == 1:\n",
    "            if year in range(journal['start'], journal['end']):\n",
    "                stats_journals[year] += 1"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Convert into a data frame"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-06-24T18:33:25.405863Z",
     "start_time": "2021-06-24T18:33:25.402295Z"
    },
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "years_df = pd.DataFrame.from_dict(stats_journals, \n",
    "                                  orient=\"index\", \n",
    "                                  columns=[\"stats_journals\"])\n",
    "\n",
    "years_df['year'] = years_df.index\n",
    "years_df.reset_index(drop=True, inplace=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-06-24T18:33:26.533434Z",
     "start_time": "2021-06-24T18:33:26.525904Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>stats_journals</th>\n",
       "      <th>year</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>14</td>\n",
       "      <td>1803</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>11</td>\n",
       "      <td>1804</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>11</td>\n",
       "      <td>1805</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>13</td>\n",
       "      <td>1806</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>13</td>\n",
       "      <td>1807</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   stats_journals  year\n",
       "0              14  1803\n",
       "1              11  1804\n",
       "2              11  1805\n",
       "3              13  1806\n",
       "4              13  1807"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "years_df.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Add soc journals"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-06-24T18:34:17.967891Z",
     "start_time": "2021-06-24T18:34:17.960822Z"
    }
   },
   "outputs": [],
   "source": [
    "years_df['soc_journals'] = years_df['year'].map(soc_journals)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-06-24T18:34:22.226651Z",
     "start_time": "2021-06-24T18:34:22.212580Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>stats_journals</th>\n",
       "      <th>year</th>\n",
       "      <th>soc_journals</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>14</td>\n",
       "      <td>1803</td>\n",
       "      <td>4</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>11</td>\n",
       "      <td>1804</td>\n",
       "      <td>4</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>11</td>\n",
       "      <td>1805</td>\n",
       "      <td>4</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>13</td>\n",
       "      <td>1806</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>13</td>\n",
       "      <td>1807</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>107</th>\n",
       "      <td>1991</td>\n",
       "      <td>1910</td>\n",
       "      <td>793</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>108</th>\n",
       "      <td>2061</td>\n",
       "      <td>1911</td>\n",
       "      <td>812</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>109</th>\n",
       "      <td>2138</td>\n",
       "      <td>1912</td>\n",
       "      <td>831</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>110</th>\n",
       "      <td>2152</td>\n",
       "      <td>1913</td>\n",
       "      <td>854</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>111</th>\n",
       "      <td>1169</td>\n",
       "      <td>1914</td>\n",
       "      <td>458</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>112 rows × 3 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "     stats_journals  year  soc_journals\n",
       "0                14  1803             4\n",
       "1                11  1804             4\n",
       "2                11  1805             4\n",
       "3                13  1806             5\n",
       "4                13  1807             5\n",
       "..              ...   ...           ...\n",
       "107            1991  1910           793\n",
       "108            2061  1911           812\n",
       "109            2138  1912           831\n",
       "110            2152  1913           854\n",
       "111            1169  1914           458\n",
       "\n",
       "[112 rows x 3 columns]"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "years_df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [],
   "source": [
    "years_df.to_csv(ivs+\"19c_journals.csv\", index=False)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.4"
  },
  "toc": {
   "base_numbering": 1,
   "nav_menu": {},
   "number_sections": true,
   "sideBar": true,
   "skip_h1_title": false,
   "title_cell": "Table of Contents",
   "title_sidebar": "Contents",
   "toc_cell": true,
   "toc_position": {},
   "toc_section_display": true,
   "toc_window_display": true
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
